Abstract. The latest cosmological data seem to indicate a significant deviation from 
scale invariance of the primordial power spectrum when parameterized either by a 
power law or by a spectral index with non-zero "running". This deviation, by itself, 
serves as a powerful tool to discriminate among theories for the origin of cosmological 
structures such as inflationary models. Here, we use a minimally-parametric smoothing 
spline technique to reconstruct the shape of the primordial power spectrum. This 
technique is well-suited to search for smooth features in the primordial power spectrum 
such as deviations from scale invariance or a running spectral index, although it would 
recover sharp features of high statistical significance. We use the WMAP 3 year results 
in combination with data from a suite of higher resolution CMB experiments (including 
the latest ACBAR 2008 release), as well as large-scale structure data from SDSS 
and 2dFGRS. We employ cross-validation to assess, using the data themselves, the 
optimal amount of smoothness in the primordial power spectrum consistent with the 
data. This minimally-parametric reconstruction supports the evidence for a power 
law primordial power spectrum with a red tilt, but not for deviations from a power 
law power spectrum. Smooth variations in the primordial power spectrum are not 
significantly degenerate with the other cosmological parameters. 
1. Introduction 

Under simple hypotheses for the shape of the primordial power spectrum, cosmological 
parameters have been measured with exquisite precision from the Wilkinson Microwave 
Anisotropy Probe (WMAP) data alone [H E] or in combination with higher-resolution 
cosmic microwave background (CMB) experiments [31 IU E] and large scale structure 
survey data [TJ [8] . 

Observations indicate that the primordial power spectrum is consistent with being 
almost purely adiabatic and close to scale invariant, in agreement with expectations 
from the simplest inflationary models. Indeed, a power law primordial power spectrum 
fits both CMB and galaxy survey data very well. Different models for the generation 
of primordial perturbations yield different deviations from a purely scale invariant 
spectrum. The simplest can be described in terms of power laws (as e.g., in the 
simplest slow-roll inflationary models), or a small scale-dependence ("running") of the 
spectral index (also in principle arising in inflationary models; see e.g. [HI Ell E] 
for the implications of the current constraints on this parameter). However, other 
forms of deviations have been considered: for example, a broken power law [321 [13] , 
an exponential cutoff at large scales [El [151 E], harmonic wiggles superimposed 
upon a power law arising, for example, from features in the inflaton potential 
P21 EH CEE1 QSl EQl EH E2], transplanckian physics [231 El], multiple inflation [25], or 
"stringy" effects in brane inflation models [26]. 

The statistical significance of such deviations from simple scale invariance is 
often difficult to interpret [271 EH ESI [301 EI]. In addition, the significance of these 
deviations depends on several factors: assumptions about instantaneous reionization, 
the treatment of beam uncertainties, the treatment of a possible SZ contribution [2], point 
source subtraction [32], [33], the low multipole CMB angular power spectrum (C_ℓ) and 
likelihood calculation [31]. 

Here, we use a minimally-parametric reconstruction of the primordial power 
spectrum based on that presented in [35] . This reconstruction will enable one to answer 
questions such as: does the signal for the deviation from scale invariance, or deviation 
from a power-law behavior, come from a localized region in wavelength, or from all 
scales? In the first case it would be an indication that the assumed functional form 
is not the correct description of the data. In addition, such an analysis could offer 
a clue about what could be driving the signal and what systematic effects may most 
affect the detection. For example, a signal arising only at high multipoles ℓ in the 
WMAP data could point to incorrect noise, beam or point source characterizations. A 
deviation arising only from the largest observable modes (lowest wavenumber k) could, 
for example, point to foregrounds, the description of low ℓ statistics, or assumptions 
about reionization. As is the case with all non-parametric methods, this approach, not 
relying on estimates of parameters and their uncertainties, has the drawback that it 
cannot provide a straightforward measurement and a confidence interval. Nevertheless 
we will discuss how to interpret our results and compare them with parametric methods. 

Non-parametric or minimally-parametric reconstruction of the primordial power 
spectrum has only become possible recently, as cosmological datasets now provide 
enough signal-to-noise to go beyond simple parameter fitting and to explore "model 
selection". While we cannot directly measure the primordial power spectrum, 
observations of the CMB offer a window into the primordial perturbations at the 
largest scales. Large-scale structure (LSS) observations now overlap with scales accessed 
by CMB observations, and extend the measured k range to smaller scales than the 
CMB. Both CMB and LSS power spectra depend on the primordial power spectrum 
via a convolution with a non-linear transfer function which, in turn, depends on the 
cosmological parameters. In addition, large-scale structure data are affected by galaxy 
bias and by non-linear effects, which need to be modeled as outlined below. 

Typically, when fitting a model to data, the best fit parameters are found by 
minimizing the "distance"‡ of the model to the data. But when recovering a continuous 
function (such as the primordial power spectrum) from discrete data (the CMB C_ℓ or 
the bandpowers for LSS), there are potentially infinite degrees of freedom and finite 
data points. Thus it is always possible to find at least one function that interpolates 
the data and has zero or nearly zero "distance" from the data. However, as the data 
are noisy, such an interpolation will display features created by the noise (and cosmic 
variance for cosmological applications) that are not in the true underlying function. On 
the other hand, if using too few parameters (or the wrong choice of parameterization), 
the fit could miss real underlying features. Non-parametric or minimally-parametric 
inference aims at identifying, from the data themselves, how many degrees of freedom 
are needed to recover the signal without fitting the noise. 

Previous work on minimally-parametric reconstruction has employed bins [36l 12], 
piecewise linear reconstruction [37] or a combination [HI [15]. Purely non-parametric 
techniques involving transfer function deconvolution to directly re-create the power 
spectrum [381 ESI SQl HH H2], as is the case for all non-parametric methods, may show 
a tendency for the recovered function to "fit the noise". Wavelets [13] and principal 
components [S] provide rigorous non-parametric methods to search for sharp features 
as well as trends in the power spectrum. Some of the techniques presented here, such 
as cross-validation, could be useful in choosing the number of basis functions to use in 
these methods. 

This paper is organized as follows. First, we build upon the spline reconstruction 
technique presented in [35], which we briefly review in § 2. In § 3 we apply the method 
to WMAP third year data (WMAP3) alone and in combination with higher resolution 
CMB experiments and LSS data. We present our conclusions in § 4. 

2. Smoothing Spline and Penalized Likelihood 

Since the simplest inflationary models, which are consistent with the data, predict the 
primordial power spectrum to be a smooth function, we search for smooth deviations 
from scale invariance with a cubic smoothing spline technique. Here we briefly review 
refs. [15] and [35]. Spline techniques are used to recover a function f(x) based on 
measurements of f, denoted by f̂, at n discrete points x_i. Values at N "knots" of x 
are chosen. The values of F (the spline function) at the N knots uniquely define the 
piecewise cubic spline once we ask for continuity of F(x) and of its first and second 
derivatives at the knots, and impose two boundary conditions. We choose to require the 
second derivative to vanish at the exterior knots. 

‡ For example, the chi-square is a distance weighted by the errors. 
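To illustrate how the knot values together with the vanishing second derivative at the exterior knots fix the whole curve, the following Python sketch solves for the second derivatives of a natural cubic spline and evaluates it. The function names and the sample knot positions and values are hypothetical, chosen only for illustration; this is not the code used in the analysis.

```python
import bisect


def natural_spline_second_derivs(xs, ys):
    """Second derivatives M_k of the natural cubic spline through (xs, ys).

    "Natural" means M = 0 at the two exterior knots.
    """
    n = len(xs)
    M = [0.0] * n
    if n < 3:
        return M  # a straight line: no curvature anywhere
    h = [xs[i + 1] - xs[i] for i in range(n - 1)]
    # Tridiagonal system for the interior knots i = 1..n-2:
    # h[i-1] M[i-1] + 2 (h[i-1] + h[i]) M[i] + h[i] M[i+1] = rhs[i]
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n - 1)]
    rhs = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
           for i in range(1, n - 1)]
    m = n - 2
    # Forward elimination (Thomas algorithm).
    for j in range(1, m):
        w = h[j] / diag[j - 1]
        diag[j] -= w * h[j]
        rhs[j] -= w * rhs[j - 1]
    # Back substitution.
    sol = [0.0] * m
    sol[m - 1] = rhs[m - 1] / diag[m - 1]
    for j in range(m - 2, -1, -1):
        sol[j] = (rhs[j] - h[j + 1] * sol[j + 1]) / diag[j]
    M[1:n - 1] = sol
    return M


def spline_eval(xs, ys, M, x):
    """Evaluate the spline at x (xs must be strictly increasing)."""
    i = max(0, min(len(xs) - 2, bisect.bisect_right(xs, x) - 1))
    h = xs[i + 1] - xs[i]
    a = (xs[i + 1] - x) / h
    b = (x - xs[i]) / h
    return (a * ys[i] + b * ys[i + 1]
            + ((a ** 3 - a) * M[i] + (b ** 3 - b) * M[i + 1]) * h * h / 6.0)


# Hypothetical example: 5 knots with made-up values.
knots = [0.0, 1.0, 2.0, 3.0, 4.0]
values = [0.0, 1.0, 0.0, 1.0, 0.0]
M = natural_spline_second_derivs(knots, values)
curve = [spline_eval(knots, values, M, 0.5 * j) for j in range(9)]
```

With N knots there are N free numbers (the knot values); everything between the knots follows from the continuity and boundary conditions.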

Allowing infinite freedom to the knot values and simply minimizing the chi-square 
will tend in general to fit features created by the random noise present in the data. It 
is therefore customary to add a roughness penalty, which we choose to be the integral of 
the squared second derivative of the spline function§: 

$$ S(F) = \sum_{i,j=1}^{n} \left[F(x_i) - \hat{f}(x_i)\right] C^{-1}_{ij} \left[F(x_j) - \hat{f}(x_j)\right] + \lambda \int_{x_1}^{x_n} \left[F''(x)\right]^2 dx \qquad (1) $$

where C_ij denotes the data covariance matrix, and λ is the smoothing parameter. 
The roughness penalty effectively reduces the degrees of freedom, disfavouring jagged 
functions that "fit the noise". As λ goes to infinity, one effectively implements linear 
regression; as λ goes to zero one is interpolating. It can be shown that for this 
functional form of the penalty function, the cubic spline is the function that minimizes 
the roughness penalty for given values of F(x_k) at the knots x_k. 
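As a worked note on the penalty term: for a cubic spline the second derivative is piecewise linear between knots, so the roughness integral over the knot range has a closed form. Writing M_k = F''(x_k) and h_k = x_{k+1} − x_k (notation introduced here for illustration, not taken from the text), one finds

$$ \int_{x_1}^{x_N} \left[F''(x)\right]^2 dx = \sum_{k=1}^{N-1} \frac{h_k}{3}\left(M_k^2 + M_k M_{k+1} + M_{k+1}^2\right). $$

This follows from integrating the square of the linear interpolant between M_k and M_{k+1} on each interval. The sum vanishes only if every M_k is zero, i.e. for a straight line, which is why the limit of very large λ reduces to linear regression.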

The number of knots is usually chosen and kept fixed in the analysis. We choose to 
use 5-6 knots: the dimensionality of the problem grows with the number of knots, thus 
this corresponds to a 9-10-dimensional problem. Beyond a minimum number of knots, 
there is a trade-off between the number of knots and the penalty, and the form of the 
reconstructed function does not depend significantly on the number of knots after this 
minimum number is reached. As the main goal of this work is to explore, in a minimally 
parametric way, smooth deviations from scale invariance (e.g., a red tilt or a running), 
a few (~ 3) knots are sufficient. 

In generic applications of smoothing splines, cross-validation is a rigorous statistical 
technique for choosing the optimal smoothing parameter. Cross-validation (CV) 
quantifies the notion that if the underlying function has been correctly recovered, it 
should accurately predict new, independent data. The most rigorous (but also most 
computationally expensive) form of CV is referred to as "leave-one-out" CV: the analysis 
is carried out leaving one data point out, then the distance between the recovered 
function and that data point is computed and stored. This is repeated for each data 
point and then the sum of the resulting distances, the "CV score", is computed. Finally, 
the best penalty λ is the one that minimizes the sum, i.e. the CV score. 

k-fold cross-validation follows a similar procedure but splits the sample into k 
subsamples. It becomes identical to "leave-one-out" CV when k is equal to the number 
of data points n. For k ≪ n this CV technique becomes increasingly fast. In many 
cases, for example when the measurements give a direct estimate of f in Eq. (1), the 
calculation of "leave-one-out" CV can be significantly shortened [15]. In cosmological 
applications, however, the observable quantities and the primordial power spectrum 
are connected by a convolution with the transfer function, making the short-cut not 
applicable. To make the problem computationally manageable, we opt for a n/2-fold 
cross-validation. 

§ Note that for a twice continuously differentiable function such as our cubic spline, this is a measure 
of its total curvature. 
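The relation between k-fold and "leave-one-out" CV can be made concrete with a small Python sketch that builds the held-out subsamples. The helper names are hypothetical, and the interleaved assignment of points to folds is an assumption of this sketch (chosen because it mirrors taking every other data point when k = 2).

```python
def kfold_indices(n, k):
    """Split the indices 0..n-1 into k interleaved folds."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    return [list(range(f, n, k)) for f in range(k)]


def train_test_splits(n, k):
    """Yield (training indices, held-out indices) for each of the k folds."""
    for held_out in kfold_indices(n, k):
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        yield train, held_out


# k = n reproduces leave-one-out: every fold holds out a single point.
leave_one_out = kfold_indices(4, 4)

# k = 2 gives two interleaved samples (every other data point).
two_samples = kfold_indices(6, 2)
```

The number of fits needed equals the number of folds, which is why small k is so much cheaper than leave-one-out when each fit is expensive.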

3. Implementation of Spline Reconstruction 

We consider the following datasets: WMAP three year temperature and polarization 
power spectra (WMAP3) jSlHSJHTj; Cosmic Background Imager temperature data [I] 
(CBI); BOOMERanG [6]; Very Small Array temperature power spectrum [5] (VSA); 
Arcminute Bolometer Array Receiver temperature power spectrum [481 Ej (ACBAR); 
[49] (ACBAR08); the galaxy power spectrum from the SDSS main sample [7]; the 
Anglo-Australian Two Degree Field galaxy redshift survey [8] (2dFGRS) and the power 
spectrum from the SDSS DR4 luminous red galaxy sample (LRG) [50] . 

In particular we consider WMAP3 data alone, and in combination with either higher 
resolution CMB experiments or large-scale structure data. 

In this application, Eq. (1) becomes: 



where ℒ(Data|α, P(k)) denotes the likelihood of the data (C_ℓ bandpowers, or 
bandpowers of the galaxy power spectrum), given the cosmological parameters {α} and 
the primordial power spectrum P(k). In this approach P(k) is fully determined by its 
values at the knots. In other words, as the function to be reconstructed in a minimally 
parametric way with the spline approach is P(k), the penalty function following e.g., 
[43] should be its second derivative P''(k). Another possibility would have been to 
parameterize the primordial power spectrum as ∝ k^n(k) and to reconstruct the function 
n(k); in this case the penalty function would have been different, but an underlying 
assumption on the form of the primordial power spectrum would have been made. 
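For orientation, one way to write a penalized likelihood of the kind described here, by analogy with Eq. (1), is sketched below. The sign convention, the factor of 2, the normalization of λ, and the choice of k rather than ln k as the variable of differentiation and integration are all assumptions of this sketch and may differ from the form used in the analysis.

$$ -2\ln\mathcal{L}_{\rm pen} = -2\ln\mathcal{L}(\mathrm{Data}\,|\,\alpha, P(k)) + \lambda \int \left[P''(k)\right]^2 dk $$

For a Gaussian likelihood, −2 ln ℒ equals the chi-square up to an additive constant, so the first term plays the role of the data term in Eq. (1) and the second term is the same roughness penalty applied to P(k).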

We use 5 knots for the WMAP3 data when considered alone and 6 knots when 
in combination with other datasets. As explained above, our main goal is to explore 
smooth deviations from scale invariance and thus a few (~3) knots are sufficient. The 
knot locations are illustrated in Fig. 1. We have explored different knot locations and 
found that while the reconstructed form for P(k) does not depend significantly on knot 
locations (as long as the knots sample the full k-range, see Appendix) the convergence 
speed of the Markov chains does depend on knot location. 

We further develop the implementation of [35] by (a) running new chains for each 
value of the penalty rather than using importance sampling to explore small changes in 
λ; (b) varying cosmological parameters when computing CV; (c) adding other datasets 
beyond WMAP3 to the data compilation; and (d) changing the way the CV sample is 
chosen. 

Figure 1. Left: Triangles show the position of the 5 knots used for WMAP3. Crosses 
show the CV bins (CV1 in red, CV2 in black); the width of the cross shows the 
bin width and the height shows the expected noise (cosmic variance + experimental 
noise) for the fiducial model shown by the solid line. Right: CV set up for 
higher-resolution CMB experiments; the color scheme is as in the left panel. The points 
are the actual data with error-bars. The datasets used are CBI, VSA for CV1 and 
BOOMERanG, VSA for CV2. Throughout, C_ℓ are in units of μK^2. 

To set up CV, the data set is split into two samples (denoted by CV1 and CV2). 
We split WMAP3 data in bins of roughly equal signal-to-noise, as illustrated in Fig. 1. 
In the released WMAP3 v2p2p2 likelihood package, the low ℓ likelihood (ℓ < 32) is 
computed using a pixel-based method, and thus ℓ's from 2 to 32 must belong to the 
same CV bin. This sets the CV bin size: all the CV bins have roughly the same 
signal-to-noise. With this choice we also minimize the effect of off-diagonal coupling (which 
becomes negligible at large separations in ℓ). The polarization data (TE and EE) at 
ℓ < 23 are always used in implementing CV, as they encode information on reionization and 
the optical depth parameter τ, and not on the shape of the primordial power spectrum. 
We find that this choice greatly reduces the degeneracy between τ and the shape of the 
primordial power spectrum on the largest scales in the CV1 runs. Note that for the 
k-range corresponding to ℓ < 100, there are 2 knots and 5 CV bins, while in the range 
corresponding to ℓ > 100, there are 3 knots and 53 CV bins. As each bin has roughly 
the same signal-to-noise, the low ℓ range is actually sampled by the knots much more 
finely than the high ℓ range. 

For the remaining datasets we set up CV as follows. For the high resolution CMB 
data, CV1 includes VSA and ACBAR, and CV2 includes CBI and BOOMERanG 
(see Fig. 1). As SDSS bandpowers are essentially uncorrelated, we use every other 
bandpower for CV. 

Figure 2. Primordial power spectrum P(k) (left) and corresponding spectral index 
n_s(k) (right) reconstructed from WMAP3 data alone for the CV-selected optimal 
penalty. The primordial power spectrum shape seems to acquire a curvature at ℓ > 300. 
The location of the knots is shown on the top of the figure. Throughout, the units of 
k are Mpc^-1. 

For each of the CV samples and for a grid of penalty values λ, we run a Markov 
Chain Monte Carlo (MCMC), using a suitably modified version of the publicly available 
software CosmoMC [51], [52]. The best fit model from CV1 is then run through the CV2 
data sample, the likelihood is stored, and vice versa. For each value of the penalty λ, 
the sum of the logarithm of the two likelihoods so obtained is our proxy for the CV 
score. The optimal penalty is the one that maximizes the CV score. Once the optimal 
penalty is found, an MCMC is run for the chosen penalty using all the data. 
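The two-sample procedure can be illustrated on a toy problem. The Python sketch below is not the MCMC pipeline described here: it swaps in a simple discrete smoother (a second-difference penalty on a grid, with held-out points given zero weight) and a Gaussian log-likelihood, purely to show the logic of fitting on one sample, scoring on the other, and picking the penalty that maximizes the summed score. All function names and test data are hypothetical.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    A = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for j in range(c, n + 1):
                A[r][j] -= f * A[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (A[r][n] - sum(A[r][j] * x[j] for j in range(r + 1, n))) / A[r][r]
    return x


def smooth(y, sigma, use, lam):
    """Minimize sum_{i in use} ((z_i - y_i)/sigma_i)^2
    + lam * sum_i (z_{i-1} - 2 z_i + z_{i+1})^2 over z. Requires lam > 0."""
    n = len(y)
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for i in use:
        w = 1.0 / sigma[i] ** 2
        A[i][i] += w
        b[i] += w * y[i]
    for i in range(1, n - 1):
        stencil = ((i - 1, 1.0), (i, -2.0), (i + 1, 1.0))
        for r, cr in stencil:
            for c, cc in stencil:
                A[r][c] += lam * cr * cc
    return solve(A, b)


def log_like(z, y, sigma, idx):
    """Gaussian log-likelihood (up to a constant) on the points in idx."""
    return -0.5 * sum(((z[i] - y[i]) / sigma[i]) ** 2 for i in idx)


def cv_score(y, sigma, lam):
    """Fit on CV1, score on CV2, and vice versa; return the summed log-likelihood."""
    n = len(y)
    cv1 = list(range(0, n, 2))
    cv2 = list(range(1, n, 2))
    z1 = smooth(y, sigma, cv1, lam)
    z2 = smooth(y, sigma, cv2, lam)
    return log_like(z1, y, sigma, cv2) + log_like(z2, y, sigma, cv1)


def best_penalty(y, sigma, grid):
    """Penalty on the grid that maximizes the CV score."""
    return max(grid, key=lambda lam: cv_score(y, sigma, lam))


# Toy data with real curvature: a very large penalty over-smooths it.
y = [float(i * i) for i in range(9)]
sigma = [1.0] * 9
chosen = best_penalty(y, sigma, [1e-3, 1.0, 1e4])
```

In this toy the fit is a linear solve, so a fine penalty grid is cheap; in the analysis each grid point requires its own chains, which is what makes the choice of CV scheme matter.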

3.1. WMAP 3-year data alone 

We start by considering WMAP3 data alone and use the latest version of the WMAP 
likelihood code (v2p2p2) which includes an updated point-source correction [33], beam 
error propagation and foreground marginalization on large scales. We do not include a 
Sunyaev-Zel'dovich [53] (SZ) contribution to the C_ℓ. 

In Fig. 2 we show the reconstructed power spectrum for the optimal penalty 
(λ_opt,WMAP) and the corresponding spectral index n_s(k). Here n_s(k) is defined (and 
obtained) by the first derivative of P(k): n_s(k) = 1 + d ln P/d ln k. The light (blue) 
lines are the best fitting 68% and the darker line is the multi-dimensional best fit‖. As 
customary in CMB studies, k is in units of Mpc^-1. When interpreting the plot of the 
scale dependence of the spectral slope n_s(k), one should keep in mind that the quantity 
that was reconstructed, and for which CV was used to find the optimal penalty, is 
actually the power spectrum; thus the penalty may be sub-optimal for n_s(k). 
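The definition n_s(k) = 1 + d ln P/d ln k can be evaluated numerically from a tabulated P(k). The Python sketch below uses simple finite differences in ln k; the function name, the wavenumber grid, and the amplitude and pivot used in the example are hypothetical, and the spline derivative used in the analysis would be computed analytically rather than this way.

```python
import math


def spectral_index(ks, Ps):
    """n_s(k) = 1 + d ln P / d ln k from tabulated values.

    Uses centered differences in ln k at interior points and
    one-sided differences at the two ends. Needs at least 2 points.
    """
    lnk = [math.log(k) for k in ks]
    lnP = [math.log(P) for P in Ps]
    n = len(ks)
    ns = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        ns.append(1.0 + (lnP[hi] - lnP[lo]) / (lnk[hi] - lnk[lo]))
    return ns


# Example: a pure power law with tilt 0.96 gives a constant n_s(k).
ks = [1e-4 * 10 ** (j / 4.0) for j in range(13)]
Ps = [2.4e-9 * (k / 0.05) ** (0.96 - 1.0) for k in ks]
tilt = spectral_index(ks, Ps)
```

Because differentiation amplifies small wiggles, a P(k) that looks smooth can still yield a visibly structured n_s(k), which is one way to see why a penalty tuned for P(k) may be sub-optimal for n_s(k).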

‖ In all cases, when plotting the 95% best fitting spline curves, the "envelope" is only slightly larger 
but the individual spline curves are more "wiggly". For clarity, we show only the 68% best curves. 

The large cosmic variance in the low ℓ region allows a lot of freedom in the shape of 
the primordial power spectrum, but makes any downturn at large scales not significant. 
Fig. 2 indicates that the signal for deviation from scale invariance seems to arise from 
ℓ > 400, where a downturn is visible. The inclusion of an SZ contribution would make 
this effect even larger, but is not considered here. Beyond this trend at high ℓ, we do not 
find any evidence for features in the power spectrum. At high ℓ, different systematics 
affect the reconstructed P(k): the beam model, beam error propagation and point 
source subtraction. In particular [32] find that, assuming a power law primordial power 
spectrum, the recovered spectral slope changes depending on the estimated point source 
amplitude and its error, but that beam errors have a small effect. They tentatively argue 
that the point source uncertainty should be increased by 60% compared to the WMAP 
estimated value (which would tend to reduce the significance of a red spectrum). They 
also suggest that the fiducial point source contribution to be subtracted out may be 
smaller by ~25% than the WMAP value. We find that, if we use the fiducial point 
source amplitude estimate of [32], the reconstructed P(k) does not show the high ℓ 
downturn. Along the same lines, [49] find that there is a tension between the σ_8 value 
recovered from WMAP3 data alone and that recovered from WMAP3+ACBAR08 data, 
with WMAP's estimate being lower. They conclude that the lower σ_8 value favored by 
WMAP3 alone is driven by WMAP measurements at high ℓ. 

It is interesting to compare this reconstruction with that presented in [35] for the 
first year WMAP data release (WMAP1). A direct comparison of the two studies needs 
to be done with caution. The authors of [35] were interested in deviations from scale 
invariance; thus they report the quantity n(k) = 1 + (ln P(k) − ln P(k_0))/(ln k − ln k_0), 
while here we are interested in more general deviations, so we show n_s(k). For n_s = 1, 
n = n_s. We can see that the WMAP1-based reconstruction shows a similar behavior: a P(k) 
consistent with scale invariance on large scales and a downturn at k > 0.01. But the WMAP3 
optimal penalty is lower than that for WMAP1 (0.02 vs 0.1 when converted to the same units), 
reflecting the fact that the noise level in WMAP3 is lower (in particular, the error on 
C_ℓ due to noise is a factor ~3 lower). 
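The relation between the two quantities can be made explicit. Since d ln P/d ln k = n_s(k) − 1, integrating from a reference wavenumber k_0 (the symbol k_0 is used here for the reference scale in the definition of n(k)) gives

$$ n(k) - 1 = \frac{\ln P(k) - \ln P(k_0)}{\ln k - \ln k_0} = \frac{1}{\ln k - \ln k_0} \int_{\ln k_0}^{\ln k} \left[n_s(k') - 1\right] d\ln k' . $$

So n(k) − 1 is the average of n_s − 1 over the interval between k_0 and k. The two quantities coincide whenever n_s is constant, which includes the scale-invariant case n_s = 1, and they differ once the local slope varies with scale.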

In Table 1 we report constraints on cosmological parameters to show how they 
are affected by the extra freedom in the primordial power spectrum, along with the 
power law and running spectral index models as reported in [2]. The τ determination is 
virtually unaffected by the additional freedom in the primordial power spectrum. This 
was not the case in WMAP1 (see [35]), but this is understandable as, in WMAP3, τ is 
well constrained by the EE polarization data alone [47]. 

3.2. WMAP 3-year data and Higher Resolution CMB Experiments 

Following [2], to minimize covariance between WMAP and higher resolution CMB 
experiments, we consider only the following subsets of the data: for ACBAR, only 
bandpowers at ℓ > 800; for CBI, only bandpowers 5 to 12 (600 < ℓ < 1800); for 
VSA, 5 bandpowers with mean ℓ-values of 894, 995, 1117, 1269 and 1407; and for 
BOOMERanG, 7 bandpowers with central ℓ > 800. As before, we do not consider an 
SZ contribution to the C_ℓ, but, by not considering bandpowers at ℓ > 2000, scales 
possibly affected by the "SZ excess" are excluded from the analysis. 

When combining WMAP3 with higher resolution CMB experiments (WMAPext), 
we find that the optimal penalty (λ_opt,ext) becomes higher (λ_opt,ext = 25 λ_opt,WMAP; i.e. 
the data do not require as much freedom in the shape of the primordial power spectrum), 
and that CV becomes less sensitive to the value of the penalty. In other words, the CV 
score dependence on penalty flattens out. We interpret this as the recovered P(k) being 
smooth, and its second derivative being small enough to make the total likelihood less 
sensitive to the penalty function. In fact, most P(k) features giving rise to a second 
derivative are localized at low ℓ (small k) where the signal is dominated by cosmic 
variance. As statistical power is added to small scales, the "wiggliness" allowed by 
the large scales gets downweighted. Since the WMAP3 penalty is lower than the one 
found for WMAP1 by [35], one may intuitively expect that adding extra datasets would 
reduce the penalty further. Here this is not the case: first, an extra knot is added and 
the k-range probed increases by a decade; second, the downturn that was significant in 
WMAP3 data alone is now not as significant. 

Table 1. Effect on cosmological parameters of the extra freedom in the primordial 
power spectrum, for WMAP3 alone. We report only the parameters for which errors 
are affected more than 10%. "PL" means power law power spectrum, "run" means 
running spectral index, "spline λ_opt,WMAP" means spline reconstruction with optimal 
penalty set by CV, and "spline λ = 0" means spline with no penalty. "PL" and "run" 
are taken from [5]. 

WMAP       PL                  run                spline λ_opt,WMAP   spline λ = 0
Ω_b h^2    0.0223 ± 0.00073    0.021 ± 0.001      0.021 ± 0.001       0.0192 ± 0.0012
Ω_c h^2    0.1054 ± 0.0078     0.114 ± 0.0098     0.117 ± 0.011       0.141 ± 0.018
h          0.733 ± 0.032       0.681 ± 0.042      0.679 ± 0.047       0.584 ± 0.058
σ_8        0.761 ± 0.049       0.77 ± 0.05        0.818 ± 0.052       0.881 ± 0.051

Fig. 3 shows the P(k) recovered for the optimal (CV-selected) penalty. Now a 
deviation from scale invariance is clearly visible; the signal is distributed on all scales 
and consistent with a red-tilted power law power spectrum. This is more clearly seen 
in the corresponding dependence on scale of the spectral slope. A scale-independent 
spectral slope and a red tilt is a better fit to the data than a scale invariant power 
spectrum (indicated by the dashed line). 

For comparison, and to visualize the effect of implementing CV, in Fig. 4 we show 
the reconstructed power spectrum for penalty set to zero. While one may be tempted 
to interpret the reconstructed power spectrum as having features, CV shows that they 
are not significant. In Table 2 we report the effect on cosmological parameters of the 
extra freedom in the primordial power spectrum. As we have seen previously, τ is not 
affected. 

Figure 3. Primordial power spectrum P(k) (left) and spectral index n_s(k) (right) 
reconstructed from WMAPext for the CV-selected optimal penalty. A deviation 
from scale invariance consistent with a red-tilted power law form is clearly visible. 
The dashed line corresponds to a scale invariant power spectrum: the reconstructed 
spectrum is consistent with a scale independent spectral slope and a red tilt. 
Throughout, the units of k are Mpc^-1. 

Figure 4. Primordial power spectrum P(k) reconstructed from WMAP3 (left) and 
WMAPext (right) data, without CV penalty. While one may be tempted to interpret 
the reconstructed power spectrum as having features, CV shows that they are not 
significant, and the recovered optimal P(k) is that shown in Figs. 2 and 3. The units 
of k are Mpc^-1. 

Table 2. Effect on cosmological parameters of the extra freedom in the primordial 
power spectrum for WMAPext data, in the same format as Table 1. 

WMAPext    PL                  run                spline λ_opt,ext    spline λ = 0
Ω_b h^2    0.0223 ± 0.00073    0.021 ± 0.001      0.0221 ± 0.00075    0.018 ± 0.0011
Ω_c h^2    0.103 ± 0.0081      0.114 ± 0.0098     0.106 ± 0.0071      0.15 ± 0.017
h          0.739 ± 0.031       0.68 ± 0.04        0.733 ± 0.033       0.55 ± 0.056
σ_8        0.739 ± 0.049       0.77 ± 0.05        0.764 ± 0.042       0.92 ± 0.056

While this work was being completed, the ACBAR collaboration released new 
results and the CMB temperature power spectrum for the complete set of observations 
[49]. These new results greatly improve calibration and the uncertainties on bandpowers 
decrease by more than a factor ~2. We show here the recovered power spectrum for 
the combination of WMAP3+ACBAR08 data. We consider only ACBAR bandpowers 
which include 550 < ℓ < 2100. The lower ℓ cut is motivated by minimization of 
covariance with WMAP3 while the high ℓ cut is motivated by the "excess" power which 
has been attributed to secondary/foreground effects. We use the same penalty as for 
the other WMAPext runs. We find that the Ω_b h^2, Ω_c h^2 and h determinations are virtually 
indistinguishable from those reported in Table 2 and that the reconstructed P(k) and 
n_s(k) are also virtually indistinguishable from those obtained from the full WMAPext 
data set (Fig. 5). We find σ_8 = 0.79 ± 0.04. Note that ACBAR08 has a statistical 
power comparable to the entire set of other high-resolution CMB experiments used in 
WMAPext. 

Figure 5. Primordial power spectrum P(k) (left) and spectral index n_s(k) 
(right) reconstructed from the WMAP3+ACBAR08 dataset for the same penalty as the 
WMAPext data set. The WMAP3+ACBAR08 reconstruction is very consistent with, and has very 
similar error-bars to, that from the full WMAPext data set. The units of k are Mpc^-1. 

3.3. Including Large Scale Structure Data 

We implement the n/2-fold CV on the SDSS power spectrum bandpowers by taking 
every other bandpower. CV shows that while for small penalty the CV score improves 
as penalty is increased, the improvement flattens out at high penalties. Conservatively, 
we use the minimum penalty that gives the flattened-out CV score. This also happens 
to be the optimal penalty for the CMBext datasets (λ_opt,ext). 

The reconstructed power spectra from SDSS main and 2dFGRS are shown in Fig. 6 
(top and bottom panels, respectively). For comparison we report the reconstructed 
P(k) without penalty in Fig. 7. 

The cosmological parameter constraints from WMAP3 + large-scale structure are 
reported in Table 3. This dataset combination shows the same trends as the WMAPext 
data combination. 

Figure 6. Reconstructed power spectrum P(k) (left) and its spectral index n_s(k) 
(right) for the WMAP3+SDSS data set (upper panels) and WMAP3+2dFGRS data 
set (lower panels). 

Figure 7. Reconstructed power spectrum P(k) with no penalty for the WMAP3+SDSS 
(left) and WMAP3+2dFGRS (right) data set. 

Table 3. Effect on cosmological parameters of the extra freedom in the primordial 
power spectrum for WMAP3+SDSS main galaxy sample and WMAP3+2dFGRS data, 
in the same format as Table 1. 

SDSS       PL                  run                spline λ_opt,ext    spline λ = 0
Ω_b h^2    0.0223 ± 0.00070    0.021 ± 0.001      0.0223 ± 0.00072    0.0184 ± 0.0011
Ω_c h^2    0.132 ± 0.0065      0.139 ± 0.0078     0.125 ± 0.0057      0.14 ± 0.012
h          0.709 ± 0.026       0.66 ± 0.03        0.664 ± 0.023       0.57 ± 0.04
σ_8        0.772 ± 0.041       0.783 ± 0.041      0.86 ± 0.037        0.912 ± 0.041

2dFGRS     PL                  run                spline λ_opt,ext    spline λ = 0
Ω_b h^2    0.0222 ± 0.00070    0.021 ± 0.001      0.022 ± 0.00074     0.0203 ± 0.0014
Ω_c h^2    0.126 ± 0.0051      0.128 ± 0.0055     0.107 ± 0.0050      0.115 ± 0.011
h          0.732 ± 0.021       0.703 ± 0.026      0.720 ± 0.022       0.672 ± 0.044
σ_8        0.736 ± 0.036       0.739 ± 0.038      0.776 ± 0.037       0.803 ± 0.063

Recently, a lot of attention has been given to the value of σ_8: several cosmological 
observables depend very strongly on this parameter, such as the number density of 
clusters of galaxies and the amplitude of the contribution of the Sunyaev-Zel'dovich 
effect to the CMB power spectrum at mm wavelengths. Tables 1, 2 and 3 show that the 
σ_8 determination from WMAP3 data alone depends very strongly on the assumptions 
about the primordial power spectrum. This can be understood if we consider that 
most of the scales contributing to fluctuations on 8 h^-1 Mpc are not directly probed by 
WMAP: an extrapolation is required. These scales are probed by the higher resolution 
CMB experiments and by large-scale structure data; thus σ_8 becomes progressively less 
sensitive to assumptions about the power spectrum shape. 

3.4. SDSS Luminous Red Galaxies 

Beyond the power spectrum for the main galaxy sample, SDSS also offers the power 
spectrum of the luminous red galaxies (LRGs). LRGs are more luminous than the main 
sample and thus probe a larger volume: the LRG power spectrum thus has potentially 
greater statistical power. It is, in addition, a very interesting sample to examine because 
many forthcoming and planned dark-energy experiments focus on these galaxies to 
sample even larger survey volumes and measure the baryon acoustic oscillation (BAO) 
signal. [54], [55] found a tension between the power spectra from the 2dFGRS sample 
and the LRG sample, and concluded that LRGs have a stronger scale-dependent bias 
than blue-selected 2dFGRS galaxies. 

Therefore, we consider the LRG sample separately and explore the recovered 
primordial power spectrum shape. While a full treatment and comparison between 
the DR4 and the DR5 power spectra will be presented in a forthcoming paper, we 
present here a few insights that can be enabled by a non-parametric method. To model 
the effects of redshift space distortions, non-linearities and galaxy biasing, we use the 
empirical form developed in [8]: 

$$P_{\rm gal}(k) = b^2\,\frac{1 + Q k^2}{1 + A k}\,P_{\rm lin}(k) \qquad (3)$$

where $b$ denotes a constant scale-independent normalization (bias) and $A$ and $Q$ are 
empirical parameters. [8] shows that the value $A = 1.4$ is robust but that $Q$ depends on 
galaxy type. Thus we leave $Q$ as a free parameter; in particular we do not marginalize 
over it analytically with a given prior but treat it as an extra MCMC parameter. The 
bias parameter $b$ is treated in the same way. To work in the linear regime we consider 
only scales $k < 0.1\,h$/Mpc, and we use the same penalty as for the other LSS datasets. 
We find that the parameter $Q$ is virtually unconstrained, and that the addition of LRG 
data does not improve constraints on $P(k)$ and $n_s(k)$ as much as the main SDSS sample 
or 2dFGRS data (Fig. 8). 

Figure 8. Reconstructed power spectrum (left) and spectral slope $n_s(k)$ (right) for 
WMAP3+LRG for the same penalty used for the other LSS data. Comparison with 
Fig. 6 shows that the LRG sample has less statistical power than the other LSS datasets. 

Table 4. Effect on cosmological parameters of the extra freedom in the primordial 
power spectrum for WMAP3+LRG. The first two columns are from [50]. 

WMAP+LRG          PL                  run               spline $\lambda_{\rm opt,ext}$    WMAP only, $\lambda_{\rm opt,ext}$
$\Omega_b h^2$    0.0222 ± 0.00070    0.021 ± 0.001     0.0225 ± 0.001                    0.0214 ± 0.0059
$\Omega_c h^2$    0.105 ± 0.004       0.109 ± 0.004     0.114 ± 0.006                     0.107 ± 0.008
$h$               0.73 ± 0.019        0.713 ± 0.022     0.68 ± 0.06                       0.72 ± 0.03
$\sigma_8$        0.756 ± 0.035       0.739 ± 0.036     0.80 ± 0.04                       0.77 ± 0.036
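
The empirical galaxy power spectrum model used for the LRG sample (a scale-independent bias $b$ multiplying $(1 + Qk^2)/(1 + Ak)$ times the linear spectrum) can be written as a short function. This is only a sketch: the function and argument names are hypothetical, the linear power values are placeholders, and `k` is assumed to be in h/Mpc.

```python
def galaxy_power(k, p_lin, b, q_nl, a_nl=1.4):
    """Empirical galaxy power spectrum: b^2 (1 + Q k^2) / (1 + A k) P_lin(k).

    a_nl defaults to the value A = 1.4 quoted in the text; b and q_nl are
    left free, as they would be when sampled as extra MCMC parameters.
    """
    return b ** 2 * (1.0 + q_nl * k ** 2) / (1.0 + a_nl * k) * p_lin


def select_linear_scales(k_values, p_values, k_max=0.1):
    """Keep only band powers with k < k_max, the scale cut described above."""
    return [(k, p) for k, p in zip(k_values, p_values) if k < k_max]


if __name__ == "__main__":
    # Placeholder numbers, for illustration only.
    k_obs = [0.02, 0.05, 0.09, 0.15]
    p_lin = [2.0e4, 1.5e4, 8.0e3, 4.0e3]
    for k, p in select_linear_scales(k_obs, p_lin):
        print(k, galaxy_power(k, p, b=1.8, q_nl=10.0))
```

Because the correction factor stays close to unity for $k < 0.1\,h$/Mpc unless $Q$ is large, it is plausible that such a scale cut leaves $Q$ poorly constrained, in line with the finding reported here.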

This differs from the ΛCDM case: when the shape of the power spectrum is fixed, 
the statistical power of the LRG sample is greater than that of the main sample (see 
Tab. 4 and [50, 56]). We therefore conclude that a better understanding of the way 
LRGs trace the underlying dark matter distribution is crucial to take advantage 
of the full statistical power of these data. This will be further explored elsewhere (Peiris 
et al. in preparation). 

4. Conclusions 

The latest compilation of cosmological data (e.g., [2]) seems to indicate a significant 
deviation from scale invariance of the primordial power spectrum when parameterized by 
a power law or by a spectral index with a "running". This deviation serves as a powerful 
tool to discriminate among theories for the origin of primordial perturbations, such as 
inflationary models. Primordial power spectra described by more complex functional 
forms have also been considered in the literature, as described in §1, ranging from 
a scale dependence of the spectral slope ("running") to sharp or oscillatory features 
("glitches"). In interpreting the results of such studies, it is very important to have a 
robust criterion for determining the optimal smoothness prior to apply 
to the reconstruction technique used to describe the primordial power spectrum. 
Ideally, in order to minimize model dependence, this criterion should use information 
from the data themselves to determine the number of degrees of freedom needed to 
recover the signal without fitting the noise. 

Here we build on the work of [35] and use a minimally-parametric reconstruction of 
the primordial power spectrum, with cross-validation as the smoothness 
criterion. We consider a range of cosmological data: WMAP 3-year data; 
complementary data from higher-resolution CMB experiments (BOOMERanG, ACBAR 
including the 2008 data, CBI and VSA); and large-scale structure power spectra from 
2dFGRS and SDSS (both the main and LRG samples). 
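
The idea of letting the data choose the smoothness can be shown with a toy example. The sketch below is not the pipeline used in this work. It replaces the spline and the cosmological likelihood with a discrete second-difference roughness penalty fitted to mock data, and it uses a simple even/odd split for cross-validation. The mock signal, the noise level, the penalty grid and all names are assumptions.

```python
import random


def smooth(y, w, lam):
    """Minimise sum_i w_i (y_i - f_i)^2 + lam * sum_i (f_{i-1} - 2 f_i + f_{i+1})^2.

    Solves the pentadiagonal normal equations by banded Gaussian elimination
    (no pivoting; the matrix is positive definite for lam > 0).
    """
    n = len(y)
    a = [[0.0] * n for _ in range(n)]
    rhs = [w[i] * y[i] for i in range(n)]
    for i in range(n):
        a[i][i] = w[i]
    stencil = (1.0, -2.0, 1.0)
    for c in range(1, n - 1):  # add lam * D^T D, one second difference at a time
        for p in range(3):
            for q in range(3):
                a[c - 1 + p][c - 1 + q] += lam * stencil[p] * stencil[q]
    for i in range(n):  # forward elimination within the band
        for r in range(i + 1, min(i + 3, n)):
            m = a[r][i] / a[i][i]
            for c in range(i, min(i + 3, n)):
                a[r][c] -= m * a[i][c]
            rhs[r] -= m * rhs[i]
    f = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = rhs[i] - sum(a[i][c] * f[c] for c in range(i + 1, min(i + 3, n)))
        f[i] = s / a[i][i]
    return f


def cv_score(y, lam):
    """Two-fold CV: fit with one half given zero weight, score on that half."""
    n = len(y)
    total = 0.0
    for held_out in (0, 1):
        w = [0.0 if i % 2 == held_out else 1.0 for i in range(n)]
        f = smooth(y, w, lam)
        total += sum((y[i] - f[i]) ** 2 for i in range(n) if i % 2 == held_out)
    return total / n


random.seed(1)
n = 300
x = [i / (n - 1) for i in range(n)]
# Mock "log power versus log k": a straight line (pure power law) plus noise.
y = [-0.16 * xi + random.gauss(0.0, 0.05) for xi in x]

penalties = [1e-4, 1e-2, 1.0, 1e2, 1e4, 1e6]
scores = {lam: cv_score(y, lam) for lam in penalties}
best_penalty = min(scores, key=scores.get)

if __name__ == "__main__":
    for lam in penalties:
        print(lam, scores[lam])
    print("CV-selected penalty:", best_penalty)
```

A weak penalty lets the fit chase the noise in the training half and predicts the held-out half poorly. Since the mock signal has no curvature, heavy smoothing predicts better, which is the toy analogue of cross-validation not asking for extra freedom.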

When considering WMAP 3-year data alone we find indications, in agreement with 
[32, 19], that the reconstructed power spectrum loses power at $k > 0.02$ Mpc$^{-1}$ compared 
with a power law spectrum. When combining WMAP3 with either higher-resolution 
CMB experiments or large-scale structure data, we find no evidence for a deviation 
from a power law. In fact, the recovered power spectrum gives a spectral slope that is 
scale independent and is characterized by a red tilt, $n_s \sim 0.96$. 
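
To make the link between a reconstructed spectrum and its slope explicit, the sketch below evaluates $n_s(k) = 1 + d\ln P/d\ln k$ by finite differences between neighbouring knots. It assumes the usual convention in which a scale-invariant primordial spectrum has $n_s = 1$. The knot positions, amplitude and pivot are hypothetical values chosen only for the example.

```python
import math


def spectral_slope(k, p):
    """Return (k_mid, n_s) between consecutive knots, with
    n_s = 1 + d ln P / d ln k estimated by finite differences."""
    k_mid, n_s = [], []
    for i in range(len(k) - 1):
        dlnk = math.log(k[i + 1]) - math.log(k[i])
        dlnp = math.log(p[i + 1]) - math.log(p[i])
        k_mid.append(math.sqrt(k[i] * k[i + 1]))  # geometric midpoint
        n_s.append(1.0 + dlnp / dlnk)
    return k_mid, n_s


# Hypothetical knots and a pure power law with a red tilt.
knots = [1e-4, 1e-3, 1e-2, 5e-2, 1e-1]
pivot, amplitude, tilt = 0.05, 2.0e-9, 0.96
power = [amplitude * (k / pivot) ** (tilt - 1.0) for k in knots]

k_mid, slopes = spectral_slope(knots, power)

if __name__ == "__main__":
    for km, ns in zip(k_mid, slopes):
        print(km, ns)
```

For a pure power law the recovered slope is the same in every interval, which is what a scale-independent $n_s$ means in practice; any running or feature would show up as a variation from one interval to the next.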

As with all non-parametric methods, this approach, which does not rely on 
parameter estimation, cannot be used to assess the statistical significance of a detection 
of deviation from scale invariance in a straightforward way. Instead, it allows one to test 
the sensitivity of the detection to the parametric form chosen to describe the deviation. 
In this context, we can interpret our findings as follows. In all dataset combinations 
(WMAPext and WMAP3+LSS), the best 68% of the spline curves are below or just touch 
the $n_s = 1$ line over 4 or 5 knots, depending on the data set; this range corresponds to two 
or more decades in $k$. Thus naively, assuming one were to connect the knots, one would 
say that the evidence is $\sim 2$ to $2.5\sigma$. However, the curves could be more "wiggly" than 
a simple linear interpolation between the knots, if the data required it. Cross-validation 
shows that the data do not require extra freedom in the primordial power spectrum; 
in addition it shows that the data require a negligible second derivative of $P(k)$ (i.e. 
a power law $P(k)$). We should interpret this result as confirmation that a power law 
power spectrum is the correct description of the data, offering renewed confidence in 
the $n_s$ constraints obtained by such parametric analysis. 

While the spline reconstruction used here is best suited for smooth features in 
the primordial power spectrum, sharp features can also be recovered if they have high 
enough signal-to-noise as illustrated in [35], where a sharp step in the primordial power 
spectrum was shown to be reconstructed. However, the technique implemented here 
would miss features with a characteristic scale much smaller than the knot spacing 
unless they were highly statistically significant. 

When adding either higher-resolution CMB data or LSS data to WMAP3, we find 
no evidence for deviations (sharp or smooth) from a power law power spectrum. Two 
independent groups [20, 39] have found persistent features in the primordial power 
spectrum, but see [21]. We suggest that in general, CV techniques could be useful to 
assess the statistical significance of these features. In fact, when not using a penalty in 
our reconstruction, we also find "features" in the power spectrum; these, however, go 
away when using the CV-selected penalty. 

We find that, with the current data compilation, the cosmological parameters are 
insensitive to the extra freedom allowed here in the shape of the primordial power 
spectrum, with one exception: $\sigma_8$. The determination of $\sigma_8$ from WMAP3 alone is 
significantly affected by assumptions about the primordial power spectrum shape; while 
this sensitivity decreases when adding external datasets which probe smaller scales, 
different data combinations lead to different results for the mean value of this parameter. 
Appendix 

To demonstrate the insensitivity of the results to the placement of the knots, in Fig. 9 
we show as an example the reconstruction for knots equally spaced in log k for WMAP3 
only data, which should be compared with Fig. 2. Note that while the reconstruction 
is robust to the choice of the knot positions (as long as the knots fully sample the k 
range), the speed of convergence of the MCMC does depend on their placement. 
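
For concreteness, knots equally spaced in log k can be generated as in the short sketch below. The range and the number of knots are hypothetical and are not the values used for Fig. 9.

```python
import math


def log_spaced_knots(k_min, k_max, n_knots):
    """Return n_knots wavenumbers equally spaced in log k over [k_min, k_max]."""
    step = (math.log(k_max) - math.log(k_min)) / (n_knots - 1)
    return [math.exp(math.log(k_min) + i * step) for i in range(n_knots)]


if __name__ == "__main__":
    # Hypothetical range in Mpc^-1, chosen only for illustration.
    print(log_spaced_knots(1e-4, 1e-1, 5))
```

Equal spacing in log k means a constant ratio between neighbouring knots, so each decade in k receives the same number of knots.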